Overview
Dataset statistics
| Number of variables | 26 |
|---|---|
| Number of observations | 101693 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 20.2 MiB |
| Average record size in memory | 208.0 B |
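As a rough consistency check on the figures above, the average record size should equal the total memory divided by the number of observations. The sketch below assumes binary units (1 MiB = 1,048,576 bytes) and starts from the rounded 20.2 MiB figure, so it only approximately reproduces the reported 208.0 B. The 8-bytes-per-value constant is an assumption about how each column is stored, not something the report states.

```python
# Figures copied from the dataset statistics table.
TOTAL_MEMORY_MIB = 20.2      # rounded value as reported
N_OBSERVATIONS = 101_693
N_VARIABLES = 26

BYTES_PER_MIB = 1024 ** 2
BYTES_PER_VALUE = 8          # assumption: one 64-bit number or pointer per cell


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row."""
    return total_mib * BYTES_PER_MIB / n_rows


avg_bytes = average_record_size(TOTAL_MEMORY_MIB, N_OBSERVATIONS)
per_row_estimate = N_VARIABLES * BYTES_PER_VALUE

print(f"From total memory: {avg_bytes:.1f} B per record")
print(f"From column count: {per_row_estimate} B per record")
```

Under that assumption the column-count estimate lands on the reported value, which suggests the memory figures are shallow sizes that do not include the contents of text cells.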
Variable types
| Numeric | 19 |
|---|---|
| Categorical | 4 |
| Text | 2 |
| DateTime | 1 |
Alerts
| Alert | Type |
|---|---|
| Average_Rating is highly overall correlated with Hotel_Name and 6 other fields | High correlation |
| Crawled_date is highly overall correlated with Hotel_Name and 1 other field | High correlation |
| FOG Index is highly overall correlated with Flesch Reading Ease | High correlation |
| Flesch Reading Ease is highly overall correlated with FOG Index | High correlation |
| Hotel_Name is highly overall correlated with Average_Rating and 10 other fields | High correlation |
| Num_of_Ratings is highly overall correlated with Hotel_Name and 1 other field | High correlation |
| Unnamed: 0 is highly overall correlated with Hotel_Name and 1 other field | High correlation |
| breadth is highly overall correlated with depth and 1 other field | High correlation |
| cleanliness_score is highly overall correlated with Average_Rating and 6 other fields | High correlation |
| comfort_score is highly overall correlated with Average_Rating and 5 other fields | High correlation |
| depth is highly overall correlated with breadth | High correlation |
| employee_friendliness_score is highly overall correlated with Average_Rating and 5 other fields | High correlation |
| facility_score is highly overall correlated with Average_Rating and 6 other fields | High correlation |
| hotel_grade is highly overall correlated with Average_Rating and 5 other fields | High correlation |
| location_score is highly overall correlated with Hotel_Name | High correlation |
| text_length is highly overall correlated with breadth | High correlation |
| value_for_money_score is highly overall correlated with Average_Rating and 6 other fields | High correlation |
| is_photo is highly imbalanced (70.6%) | Imbalance |
| Crawled_date is highly imbalanced (83.3%) | Imbalance |
| Unnamed: 0 is uniformly distributed | Uniform |
| Unnamed: 0 has unique values | Unique |
| Review_Text has unique values | Unique |
| Helpfulness has 92031 (90.5%) zeros | Zeros |
| Deviation of star ratings has 2537 (2.5%) zeros | Zeros |
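The two imbalance percentages in the alerts are consistent with a score of one minus the normalised Shannon entropy of the class frequencies. The sketch below is a reconstruction under that assumption, not code taken from the profiler. `imbalance_score` is a hypothetical helper name, and the class counts are the ones this report lists for is_photo and Crawled_date.

```python
import math


def imbalance_score(counts):
    """1 - H / log2(k): 0 for perfectly balanced classes, 1 for a single class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 1.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts copied from the Common Values tables of each variable.
is_photo_score = imbalance_score([96432, 5261])
crawled_date_score = imbalance_score([99191, 2502])

print(f"is_photo: {is_photo_score:.1%}")
print(f"Crawled_date: {crawled_date_score:.1%}")
```

A score near 100% means almost all rows fall into one class, so the column carries little information for modelling.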
Reproduction
| Analysis started | 2025-02-05 07:49:19.917456 |
|---|---|
| Analysis finished | 2025-02-05 07:49:58.302897 |
| Duration | 38.39 seconds |
| Software version | ydata-profiling v4.12.1 |
| Download configuration | config.json |
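The reported duration can be recovered from the two timestamps in the table. A minimal sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the reproduction table.
started = datetime.strptime("2025-02-05 07:49:19.917456", FMT)
finished = datetime.strptime("2025-02-05 07:49:58.302897", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```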
Variables
Unnamed: 0
Real number (ℝ)
High correlation  Uniform  Unique 
| Distinct | 101693 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 51631.052 |
| Minimum | 0 |
| Maximum | 103563 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5122.6 |
| Q1 | 25688 |
| median | 51615 |
| Q3 | 77443 |
| 95-th percentile | 98361.4 |
| Maximum | 103563 |
| Range | 103563 |
| Interquartile range (IQR) | 51755 |
Descriptive statistics
| Standard deviation | 29900.729 |
|---|---|
| Coefficient of variation (CV) | 0.57912299 |
| Kurtosis | -1.1999338 |
| Mean | 51631.052 |
| Median Absolute Deviation (MAD) | 25878 |
| Skewness | 0.0043273899 |
| Sum | 5.2505166 × 10^9 |
| Variance | 8.9405362 × 10^8 |
| Monotonicity | Strictly increasing |
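The Uniform alert for this column can be sanity-checked against the textbook moments of a continuous uniform distribution on [0, 103563]: a mean at the midpoint, a standard deviation of range/√12, and an excess kurtosis of −1.2. The sketch below compares those values with the reported statistics. It is only a plausibility check, not necessarily the test the profiler applies, and the 1% tolerance is an arbitrary choice for illustration.

```python
import math

# Values copied from the "Unnamed: 0" statistics above.
minimum, maximum = 0, 103563
reported_mean = 51631.052
reported_std = 29900.729
reported_kurtosis = -1.1999338

# Textbook moments of a continuous uniform distribution on [minimum, maximum].
uniform_mean = (minimum + maximum) / 2
uniform_std = (maximum - minimum) / math.sqrt(12)
uniform_excess_kurtosis = -6 / 5


def relative_gap(observed: float, expected: float) -> float:
    """Absolute difference expressed as a fraction of the expected value."""
    return abs(observed - expected) / abs(expected)


TOLERANCE = 0.01  # illustrative threshold, not taken from the report
looks_uniform = (
    relative_gap(reported_mean, uniform_mean) < TOLERANCE
    and relative_gap(reported_std, uniform_std) < TOLERANCE
    and abs(reported_kurtosis - uniform_excess_kurtosis) < TOLERANCE
)

# The coefficient of variation is the standard deviation divided by the mean.
cv = reported_std / reported_mean

print(f"uniform std: {uniform_std:.1f}, reported std: {reported_std:.1f}")
print(f"looks uniform: {looks_uniform}, CV: {cv:.6f}")
```

Together with the strictly increasing, all-unique values, this pattern is what a leftover row index would produce, so the column is probably not a meaningful feature.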
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 68874 | 1 | < 0.1% |
| 68872 | 1 | < 0.1% |
| 68871 | 1 | < 0.1% |
| 68870 | 1 | < 0.1% |
| 68869 | 1 | < 0.1% |
| 68868 | 1 | < 0.1% |
| 68867 | 1 | < 0.1% |
| 68866 | 1 | < 0.1% |
| 68865 | 1 | < 0.1% |
| Other values (101683) | 101683 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 |
| Value | Count | Frequency (%) |
| 103563 | 1 | |
| 103562 | 1 | |
| 103561 | 1 | |
| 103560 | 1 | |
| 103559 | 1 | |
| 103558 | 1 | |
| 103557 | 1 | |
| 103556 | 1 | |
| 103555 | 1 | |
| 103553 | 1 |
Hotel_Name
Categorical
High correlation 
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 794.6 KiB |
Length
| Max length | 35 |
|---|---|
| Median length | 22 |
| Mean length | 15.721112 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | studios2let |
|---|---|
| 2nd row | studios2let |
| 3rd row | studios2let |
| 4th row | studios2let |
| 5th row | studios2let |
Common Values
| Value | Count | Frequency (%) |
| zedwell-trocaderor | 4937 | 4.9% |
| lancaster-gate | 4887 | 4.8% |
| thistletower | 4878 | 4.8% |
| stgileshotel | 4866 | 4.8% |
| z-trafalgar | 4655 | 4.6% |
| nyx-hotel-london-by-leonardo-hotels | 4487 | 4.4% |
| radissonblugrafton | 3675 | 3.6% |
| cityinnwestminster | 3630 | 3.6% |
| sidneyhotel | 3510 | 3.5% |
| ace | 3476 | 3.4% |
| Other values (23) | 58692 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 171730 | |
| o | 165628 | |
| t | 164754 | |
| l | 139017 | 8.7% |
| a | 116689 | 7.3% |
| r | 109509 | 6.8% |
| n | 99081 | 6.2% |
| s | 89360 | 5.6% |
| - | 80136 | 5.0% |
| h | 72689 | 4.5% |
| Other values (16) | 390134 |
Review_Text
Text
Unique 
| Distinct | 101693 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 794.6 KiB |
Length
| Max length | 3604 |
|---|---|
| Median length | 1901 |
| Mean length | 211.96352 |
| Min length | 1 |
Unique
| Unique | 101693 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | Perfect location with good connections and shops and pubs |
|---|---|
| 2nd row | The room had everything you needed. Near to amenities, was good room for price just needs little updatingThe bed was so hard it felt like sleeping on a hard floor, you had to make sure you had something on your feet as flooring pinched you feet needs changing |
| 3rd row | Conveniently nearby St. Pancras, very small but clean and pleasant room (first floor with small balcony to street side). Interesting area.Luggage service can be improved by offering to lock luggage up instead of it just being put into the hall with all risks on the guests. |
| 4th row | Reception staffed 24 hours a day.All good. |
| 5th row | Very convenient to King’s Cross and the cityA little dated could do with a lick of paint |
| Value | Count | Frequency (%) |
| the | 195386 | 5.2% |
| and | 135249 | 3.6% |
| was | 114450 | 3.0% |
| to | 88001 | 2.3% |
| a | 85622 | 2.3% |
| room | 65652 | 1.7% |
| in | 57965 | 1.5% |
| very | 49945 | 1.3% |
| for | 46262 | 1.2% |
| location | 44905 | 1.2% |
| Other values (112839) | 2881166 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 3659420 | |
| e | 2033083 | 9.4% |
| o | 1529032 | 7.1% |
| t | 1490307 | 6.9% |
| a | 1473688 | 6.8% |
| n | 1137142 | 5.3% |
| r | 1033282 | 4.8% |
| i | 1025785 | 4.8% |
| s | 958874 | 4.4% |
| l | 818520 | 3.8% |
| Other values (2182) | 6396073 |
Posted_Date
Date
| Distinct | 1111 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 794.6 KiB |
| Minimum | 2021-12-01 00:00:00 |
| Maximum | 2024-12-16 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
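Comparing the distinct count with the length of the date range shows how densely the period is covered. A short sketch, treating the minimum and maximum as inclusive calendar dates:

```python
from datetime import date

# Values copied from the Posted_Date table.
first_review = date(2021, 12, 1)
last_review = date(2024, 12, 16)
distinct_dates = 1111

# Number of calendar days in the range, counting both endpoints.
calendar_days = (last_review - first_review).days + 1
days_without_reviews = calendar_days - distinct_dates
coverage = distinct_dates / calendar_days

print(f"{calendar_days} calendar days, {days_without_reviews} without a review")
print(f"coverage: {coverage:.2%}")
```

Under that reading, reviews were posted on all but one day of the three-year window.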
Rating
Real number (ℝ)
| Distinct | 25 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.7064105 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 7 |
| median | 8 |
| Q3 | 9 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.9083027 |
|---|---|
| Coefficient of variation (CV) | 0.24762536 |
| Kurtosis | 2.0209368 |
| Mean | 7.7064105 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -1.2941539 |
| Sum | 783688 |
| Variance | 3.6416191 |
| Monotonicity | Not monotonic |
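The descriptive statistics for Rating are internally consistent: the mean is the sum divided by the number of observations, and the variance is the square of the standard deviation. The sketch below checks both from the rounded values shown in the report, so small differences in the last digits are expected.

```python
# Values copied from the Rating tables and the dataset statistics.
n_observations = 101_693
rating_sum = 783_688
reported_mean = 7.7064105
reported_std = 1.9083027
reported_variance = 3.6416191

mean_from_sum = rating_sum / n_observations
variance_from_std = reported_std ** 2

print(f"mean from sum:     {mean_from_sum:.7f} (reported {reported_mean})")
print(f"variance from std: {variance_from_std:.7f} (reported {reported_variance})")
```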
| Value | Count | Frequency (%) |
| 8 | 29854 | |
| 9 | 19098 | |
| 7 | 17189 | |
| 10 | 16699 | |
| 6 | 7238 | 7.1% |
| 5 | 4496 | 4.4% |
| 4 | 2487 | 2.4% |
| 3 | 1867 | 1.8% |
| 1 | 1774 | 1.7% |
| 2 | 917 | 0.9% |
| Other values (15) | 74 | 0.1% |
| Value | Count | Frequency (%) |
| 1 | 1774 | 1.7% |
| 2 | 917 | 0.9% |
| 2.5 | 1 | < 0.1% |
| 2.9 | 1 | < 0.1% |
| 3 | 1867 | |
| 3.8 | 1 | < 0.1% |
| 4 | 2487 | |
| 4.6 | 1 | < 0.1% |
| 5 | 4496 | |
| 5.4 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 10 | 16699 | |
| 9.6 | 15 | < 0.1% |
| 9.2 | 12 | < 0.1% |
| 9 | 19098 | |
| 8.8 | 8 | < 0.1% |
| 8.3 | 7 | < 0.1% |
| 8 | 29854 | |
| 7.9 | 9 | < 0.1% |
| 7.5 | 5 | < 0.1% |
| 7.1 | 3 | < 0.1% |
Average_Rating
Real number (ℝ)
High correlation 
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.843059 |
| Minimum | 7 |
| Maximum | 8.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 7 |
| Q1 | 7.7 |
| median | 7.8 |
| Q3 | 8.2 |
| 95-th percentile | 8.6 |
| Maximum | 8.7 |
| Range | 1.7 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 0.42189297 |
|---|---|
| Coefficient of variation (CV) | 0.05379189 |
| Kurtosis | -0.32844262 |
| Mean | 7.843059 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | 0.027417095 |
| Sum | 797584.2 |
| Variance | 0.17799368 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.7 | 24861 | |
| 7.8 | 10060 | |
| 7.9 | 9511 | 9.4% |
| 8.4 | 9142 | 9.0% |
| 7.4 | 7762 | 7.6% |
| 7 | 6849 | 6.7% |
| 8.3 | 6557 | 6.4% |
| 7.6 | 5503 | 5.4% |
| 8.6 | 4746 | 4.7% |
| 8 | 4324 | 4.3% |
| Other values (5) | 12378 |
| Value | Count | Frequency (%) |
| 7 | 6849 | 6.7% |
| 7.1 | 2042 | 2.0% |
| 7.4 | 7762 | 7.6% |
| 7.5 | 2159 | 2.1% |
| 7.6 | 5503 | 5.4% |
| 7.7 | 24861 | |
| 7.8 | 10060 | |
| 7.9 | 9511 | 9.4% |
| 8 | 4324 | 4.3% |
| 8.1 | 3046 | 3.0% |
| Value | Count | Frequency (%) |
| 8.7 | 2568 | 2.5% |
| 8.6 | 4746 | 4.7% |
| 8.4 | 9142 | 9.0% |
| 8.3 | 6557 | 6.4% |
| 8.2 | 2563 | 2.5% |
| 8.1 | 3046 | 3.0% |
| 8 | 4324 | 4.3% |
| 7.9 | 9511 | 9.4% |
| 7.8 | 10060 | |
| 7.7 | 24861 |
Num_of_Ratings
Real number (ℝ)
High correlation 
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11495.815 |
| Minimum | 5613 |
| Maximum | 39497 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 5613 |
|---|---|
| 5-th percentile | 5898 |
| Q1 | 6515 |
| median | 9382 |
| Q3 | 13923 |
| 95-th percentile | 20956 |
| Maximum | 39497 |
| Range | 33884 |
| Interquartile range (IQR) | 7408 |
Descriptive statistics
| Standard deviation | 7416.8887 |
|---|---|
| Coefficient of variation (CV) | 0.64518163 |
| Kurtosis | 7.0863642 |
| Mean | 11495.815 |
| Median Absolute Deviation (MAD) | 3105 |
| Skewness | 2.5945932 |
| Sum | 1.1690439 × 10^9 |
| Variance | 55010238 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 39497 | 4937 | 4.9% |
| 14445 | 4887 | 4.8% |
| 20956 | 4878 | 4.8% |
| 14989 | 4866 | 4.8% |
| 13923 | 4655 | 4.6% |
| 9394 | 4487 | 4.4% |
| 9315 | 3675 | 3.6% |
| 7467 | 3630 | 3.6% |
| 12641 | 3510 | 3.5% |
| 6277 | 3476 | 3.4% |
| Other values (23) | 58692 |
| Value | Count | Frequency (%) |
| 5613 | 1895 | |
| 5715 | 1960 | |
| 5898 | 2563 | |
| 5932 | 3006 | |
| 5933 | 2953 | |
| 6120 | 2033 | |
| 6248 | 2502 | |
| 6277 | 3476 | |
| 6335 | 2127 | |
| 6404 | 1983 |
| Value | Count | Frequency (%) |
| 39497 | 4937 | |
| 20956 | 4878 | |
| 15320 | 2825 | |
| 14989 | 4866 | |
| 14445 | 4887 | |
| 13923 | 4655 | |
| 12641 | 3510 | |
| 12340 | 2927 | |
| 11670 | 3470 | |
| 11045 | 2786 |
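Num_of_Ratings has the same distinct count as Hotel_Name (33), and its most frequent values occur exactly as often as the most frequent hotels. That suggests it is a hotel-level attribute repeated on every review row, which would explain the correlation alert. The sketch below pairs the two top-ten tables by row count. The pairing is an inference from matching counts, not something the report states.

```python
# Ten most frequent values and their row counts, copied from the two tables.
hotel_rows = {
    "zedwell-trocaderor": 4937, "lancaster-gate": 4887, "thistletower": 4878,
    "stgileshotel": 4866, "z-trafalgar": 4655,
    "nyx-hotel-london-by-leonardo-hotels": 4487, "radissonblugrafton": 3675,
    "cityinnwestminster": 3630, "sidneyhotel": 3510, "ace": 3476,
}
num_ratings_rows = {
    39497: 4937, 14445: 4887, 20956: 4878, 14989: 4866, 13923: 4655,
    9394: 4487, 9315: 3675, 7467: 3630, 12641: 3510, 6277: 3476,
}

# Row counts are unique within each table, so they can act as a join key.
rows_to_hotel = {rows: hotel for hotel, rows in hotel_rows.items()}
inferred_ratings = {
    rows_to_hotel[rows]: value
    for value, rows in num_ratings_rows.items()
    if rows in rows_to_hotel
}
same_counts = sorted(hotel_rows.values()) == sorted(num_ratings_rows.values())

print(f"row counts match: {same_counts}")
for hotel, value in inferred_ratings.items():
    print(f"{hotel}: {value} ratings (inferred)")
```

If the pairing holds, hotel-level columns such as this one add no information beyond Hotel_Name itself.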
Helpfulness
Real number (ℝ)
Zeros 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.11225945 |
| Minimum | 0 |
| Maximum | 14 |
| Zeros | 92031 |
| Zeros (%) | 90.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 14 |
| Range | 14 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.38674756 |
|---|---|
| Coefficient of variation (CV) | 3.4451226 |
| Kurtosis | 58.132625 |
| Mean | 0.11225945 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.2864687 |
| Sum | 11416 |
| Variance | 0.14957367 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 92031 | |
| 1 | 8340 | 8.2% |
| 2 | 1050 | 1.0% |
| 3 | 192 | 0.2% |
| 4 | 42 | < 0.1% |
| 5 | 21 | < 0.1% |
| 6 | 9 | < 0.1% |
| 10 | 2 | < 0.1% |
| 7 | 2 | < 0.1% |
| 8 | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 92031 | |
| 1 | 8340 | 8.2% |
| 2 | 1050 | 1.0% |
| 3 | 192 | 0.2% |
| 4 | 42 | < 0.1% |
| 5 | 21 | < 0.1% |
| 6 | 9 | < 0.1% |
| 7 | 2 | < 0.1% |
| 8 | 2 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 14 | 1 | < 0.1% |
| 10 | 2 | < 0.1% |
| 9 | 1 | < 0.1% |
| 8 | 2 | < 0.1% |
| 7 | 2 | < 0.1% |
| 6 | 9 | < 0.1% |
| 5 | 21 | < 0.1% |
| 4 | 42 | < 0.1% |
| 3 | 192 | 0.2% |
| 2 | 1050 |
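Because Helpfulness takes only 12 distinct values, the summary statistics can be rebuilt from the value counts alone. The sketch below combines the common-values and extreme-values tables above into one mapping, then recomputes the row count, sum, mean and share of zeros.

```python
# Value -> number of rows, assembled from the Helpfulness tables above.
value_counts = {
    0: 92031, 1: 8340, 2: 1050, 3: 192, 4: 42, 5: 21,
    6: 9, 7: 2, 8: 2, 9: 1, 10: 2, 14: 1,
}

n_rows = sum(value_counts.values())
total_votes = sum(value * count for value, count in value_counts.items())
mean_votes = total_votes / n_rows
zero_share = value_counts[0] / n_rows

print(f"rows: {n_rows}, sum: {total_votes}")
print(f"mean: {mean_votes:.8f}, zeros: {zero_share:.1%}")
```

The recomputed figures agree with the reported sum of 11416 and the 90.5% zeros alert, which confirms that the two values hidden under "Other values (2)" are 9 and 14.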
is_photo
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 794.6 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 96432 | |
| 1 | 5261 | 5.2% |
review_title
Text
| Distinct | 58013 |
|---|---|
| Distinct (%) | 57.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 794.6 KiB |
Length
| Max length | 120 |
|---|---|
| Median length | 104 |
| Mean length | 30.736147 |
| Min length | 1 |
Unique
| Unique | 55598 |
|---|---|
| Unique (%) | 54.7% |
Sample
| 1st row | Exceptional |
|---|---|
| 2nd row | Very good |
| 3rd row | Convenient location |
| 4th row | Peaceful position in an elegant street close to 3 major stations and the Bloomsbury area. |
| 5th row | Great little gem in the city centre |
| Value | Count | Frequency (%) |
| good | 27150 | 5.0% |
| location | 17475 | 3.2% |
| very | 17180 | 3.2% |
| and | 17051 | 3.2% |
| great | 15107 | 2.8% |
| stay | 14641 | 2.7% |
| a | 14223 | 2.6% |
| for | 13078 | 2.4% |
| the | 12811 | 2.4% |
| hotel | 12270 | 2.3% |
| Other values (16009) | 377523 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 438406 | |
| e | 292527 | 9.4% |
| o | 268475 | 8.6% |
| a | 226999 | 7.3% |
| t | 223340 | 7.1% |
| n | 175424 | 5.6% |
| r | 151181 | 4.8% |
| l | 147815 | 4.7% |
| i | 144651 | 4.6% |
| s | 109867 | 3.5% |
| Other values (843) | 946966 |
hotel_grade
Categorical
High correlation 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 794.6 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 3 |
| 3rd row | 3 |
| 4th row | 3 |
| 5th row | 3 |
Common Values
| Value | Count | Frequency (%) |
| 4 | 45517 | |
| 3 | 41258 | |
| 5 | 7828 | 7.7% |
| 0 | 4937 | 4.9% |
| 2 | 2153 | 2.1% |
employee_friendliness_score
Real number (ℝ)
High correlation 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.5473327 |
| Minimum | 7.5 |
| Maximum | 9.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 7.5 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 8.4 |
| median | 8.6 |
| Q3 | 8.7 |
| 95-th percentile | 9.1 |
| Maximum | 9.1 |
| Range | 1.6 |
| Interquartile range (IQR) | 0.3 |
Descriptive statistics
| Standard deviation | 0.36326641 |
|---|---|
| Coefficient of variation (CV) | 0.042500558 |
| Kurtosis | 0.82320249 |
| Mean | 8.5473327 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | -0.76187002 |
| Sum | 869203.9 |
| Variance | 0.13196248 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8.7 | 20691 | |
| 8.6 | 14473 | |
| 8.4 | 10387 | |
| 8.1 | 9803 | |
| 9.1 | 9183 | |
| 8.5 | 8186 | 8.0% |
| 9 | 7851 | 7.7% |
| 8.8 | 6513 | 6.4% |
| 8.3 | 5629 | 5.5% |
| 7.5 | 4025 | 4.0% |
| Other values (2) | 4952 | 4.9% |
| Value | Count | Frequency (%) |
| 7.5 | 4025 | 4.0% |
| 8 | 2825 | 2.8% |
| 8.1 | 9803 | |
| 8.2 | 2127 | 2.1% |
| 8.3 | 5629 | 5.5% |
| 8.4 | 10387 | |
| 8.5 | 8186 | 8.0% |
| 8.6 | 14473 | |
| 8.7 | 20691 | |
| 8.8 | 6513 | 6.4% |
| Value | Count | Frequency (%) |
| 9.1 | 9183 | |
| 9 | 7851 | 7.7% |
| 8.8 | 6513 | 6.4% |
| 8.7 | 20691 | |
| 8.6 | 14473 | |
| 8.5 | 8186 | 8.0% |
| 8.4 | 10387 | |
| 8.3 | 5629 | 5.5% |
| 8.2 | 2127 | 2.1% |
| 8.1 | 9803 |
facility_score
Real number (ℝ)
High correlation 
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.8512651 |
| Minimum | 6.9 |
| Maximum | 8.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 6.9 |
|---|---|
| 5-th percentile | 6.9 |
| Q1 | 7.5 |
| median | 7.8 |
| Q3 | 8.3 |
| 95-th percentile | 8.7 |
| Maximum | 8.7 |
| Range | 1.8 |
| Interquartile range (IQR) | 0.8 |
Descriptive statistics
| Standard deviation | 0.49550699 |
|---|---|
| Coefficient of variation (CV) | 0.063111739 |
| Kurtosis | -0.69441912 |
| Mean | 7.8512651 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | 0.06281807 |
| Sum | 798418.7 |
| Variance | 0.24552718 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.8 | 13816 | |
| 7.5 | 12549 | |
| 7.6 | 10037 | |
| 8.7 | 9841 | |
| 6.9 | 6849 | 6.7% |
| 8.3 | 6013 | 5.9% |
| 7.7 | 5978 | 5.9% |
| 8 | 5872 | 5.8% |
| 7.2 | 4937 | 4.9% |
| 8.1 | 4690 | 4.6% |
| Other values (7) | 21111 |
| Value | Count | Frequency (%) |
| 6.9 | 6849 | |
| 7.2 | 4937 | 4.9% |
| 7.3 | 2042 | 2.0% |
| 7.4 | 2825 | 2.8% |
| 7.5 | 12549 | |
| 7.6 | 10037 | |
| 7.7 | 5978 | |
| 7.8 | 13816 | |
| 7.9 | 2953 | 2.9% |
| 8 | 5872 |
| Value | Count | Frequency (%) |
| 8.7 | 9841 | |
| 8.6 | 1960 | 1.9% |
| 8.5 | 3630 | 3.6% |
| 8.4 | 4655 | 4.6% |
| 8.3 | 6013 | |
| 8.2 | 3046 | 3.0% |
| 8.1 | 4690 | 4.6% |
| 8 | 5872 | |
| 7.9 | 2953 | 2.9% |
| 7.8 | 13816 |
cleanliness_score
Real number (ℝ)
High correlation 
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.2549114 |
| Minimum | 7.3 |
| Maximum | 9.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 7.3 |
|---|---|
| 5-th percentile | 7.4 |
| Q1 | 8 |
| median | 8.2 |
| Q3 | 8.7 |
| 95-th percentile | 8.8 |
| Maximum | 9.1 |
| Range | 1.8 |
| Interquartile range (IQR) | 0.7 |
Descriptive statistics
| Standard deviation | 0.4354913 |
|---|---|
| Coefficient of variation (CV) | 0.052755418 |
| Kurtosis | -0.29176239 |
| Mean | 8.2549114 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | -0.18715622 |
| Sum | 839466.7 |
| Variance | 0.18965267 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8.7 | 13231 | |
| 8.2 | 11895 | |
| 8.8 | 10903 | |
| 8.1 | 10541 | |
| 7.9 | 10440 | |
| 8 | 9929 | |
| 8.3 | 8660 | |
| 8.4 | 7723 | |
| 7.3 | 4866 | 4.8% |
| 9.1 | 4528 | 4.5% |
| Other values (4) | 8977 |
| Value | Count | Frequency (%) |
| 7.3 | 4866 | |
| 7.4 | 1983 | 1.9% |
| 7.5 | 2042 | 2.0% |
| 7.8 | 2825 | 2.8% |
| 7.9 | 10440 | |
| 8 | 9929 | |
| 8.1 | 10541 | |
| 8.2 | 11895 | |
| 8.3 | 8660 | |
| 8.4 | 7723 |
| Value | Count | Frequency (%) |
| 9.1 | 4528 | 4.5% |
| 8.8 | 10903 | |
| 8.7 | 13231 | |
| 8.5 | 2127 | 2.1% |
| 8.4 | 7723 | |
| 8.3 | 8660 | |
| 8.2 | 11895 | |
| 8.1 | 10541 | |
| 8 | 9929 | |
| 7.9 | 10440 |
comfort_score
Real number (ℝ)
High correlation 
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.2442695 |
| Minimum | 7.3 |
| Maximum | 9.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 7.3 |
|---|---|
| 5-th percentile | 7.3 |
| Q1 | 8 |
| median | 8.2 |
| Q3 | 8.7 |
| 95-th percentile | 8.9 |
| Maximum | 9.1 |
| Range | 1.8 |
| Interquartile range (IQR) | 0.7 |
Descriptive statistics
| Standard deviation | 0.4590495 |
|---|---|
| Coefficient of variation (CV) | 0.055681039 |
| Kurtosis | -0.48364911 |
| Mean | 8.2442695 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | -0.11806461 |
| Sum | 838384.5 |
| Variance | 0.21072644 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8 | 16285 | |
| 8.2 | 10426 | |
| 7.9 | 9921 | |
| 8.8 | 9643 | |
| 8.1 | 9490 | |
| 8.9 | 7273 | |
| 8.3 | 6851 | |
| 7.3 | 6849 | |
| 8.5 | 6238 | 6.1% |
| 8.7 | 4655 | 4.6% |
| Other values (6) | 14062 |
| Value | Count | Frequency (%) |
| 7.3 | 6849 | |
| 7.4 | 2042 | 2.0% |
| 7.8 | 3470 | 3.4% |
| 7.9 | 9921 | |
| 8 | 16285 | |
| 8.1 | 9490 | |
| 8.2 | 10426 | |
| 8.3 | 6851 | |
| 8.4 | 1895 | 1.9% |
| 8.5 | 6238 | 6.1% |
| Value | Count | Frequency (%) |
| 9.1 | 2568 | 2.5% |
| 9 | 1960 | 1.9% |
| 8.9 | 7273 | |
| 8.8 | 9643 | |
| 8.7 | 4655 | |
| 8.6 | 2127 | 2.1% |
| 8.5 | 6238 | |
| 8.4 | 1895 | 1.9% |
| 8.3 | 6851 | |
| 8.2 | 10426 |
value_for_money_score
Real number (ℝ)
High correlation 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.7193278 |
| Minimum | 7 |
| Maximum | 8.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 7.3 |
| Q1 | 7.5 |
| median | 7.7 |
| Q3 | 7.9 |
| 95-th percentile | 8.2 |
| Maximum | 8.3 |
| Range | 1.3 |
| Interquartile range (IQR) | 0.4 |
Descriptive statistics
| Standard deviation | 0.32353898 |
|---|---|
| Coefficient of variation (CV) | 0.041912843 |
| Kurtosis | -0.62953772 |
| Mean | 7.7193278 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | -0.18308805 |
| Sum | 785001.6 |
| Variance | 0.10467747 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.9 | 24134 | |
| 7.4 | 13927 | |
| 7.5 | 12809 | |
| 7.7 | 9056 | 8.9% |
| 8.1 | 8966 | 8.8% |
| 7.6 | 8703 | 8.6% |
| 8 | 5070 | 5.0% |
| 7.3 | 4984 | 4.9% |
| 7 | 4866 | 4.8% |
| 8.2 | 4655 | 4.6% |
| Value | Count | Frequency (%) |
| 7 | 4866 | 4.8% |
| 7.3 | 4984 | 4.9% |
| 7.4 | 13927 | |
| 7.5 | 12809 | |
| 7.6 | 8703 | 8.6% |
| 7.7 | 9056 | 8.9% |
| 7.9 | 24134 | |
| 8 | 5070 | 5.0% |
| 8.1 | 8966 | 8.8% |
| 8.2 | 4655 | 4.6% |
| Value | Count | Frequency (%) |
| 8.3 | 4523 | 4.4% |
| 8.2 | 4655 | 4.6% |
| 8.1 | 8966 | 8.8% |
| 8 | 5070 | 5.0% |
| 7.9 | 24134 | |
| 7.7 | 9056 | 8.9% |
| 7.6 | 8703 | 8.6% |
| 7.5 | 12809 | |
| 7.4 | 13927 | |
| 7.3 | 4984 | 4.9% |
location_score
Real number (ℝ)
High correlation 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.1694423 |
| Minimum | 8.2 |
| Maximum | 9.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 8.2 |
|---|---|
| 5-th percentile | 8.6 |
| Q1 | 9 |
| median | 9.1 |
| Q3 | 9.4 |
| 95-th percentile | 9.6 |
| Maximum | 9.7 |
| Range | 1.5 |
| Interquartile range (IQR) | 0.4 |
Descriptive statistics
| Standard deviation | 0.30996718 |
|---|---|
| Coefficient of variation (CV) | 0.033804366 |
| Kurtosis | 0.46670126 |
| Mean | 9.1694423 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | -0.51048551 |
| Sum | 932468.1 |
| Variance | 0.096079653 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8.9 | 16580 | |
| 9.1 | 15659 | |
| 9.4 | 12474 | |
| 9 | 11728 | |
| 9.3 | 11003 | |
| 9.5 | 7664 | |
| 9.6 | 7505 | |
| 9.2 | 6710 | |
| 8.6 | 5673 | 5.6% |
| 9.7 | 4655 | 4.6% |
| Value | Count | Frequency (%) |
| 8.2 | 2042 | 2.0% |
| 8.6 | 5673 | 5.6% |
| 8.9 | 16580 | |
| 9 | 11728 | |
| 9.1 | 15659 | |
| 9.2 | 6710 | |
| 9.3 | 11003 | |
| 9.4 | 12474 | |
| 9.5 | 7664 | |
| 9.6 | 7505 |
| Value | Count | Frequency (%) |
| 9.7 | 4655 | 4.6% |
| 9.6 | 7505 | |
| 9.5 | 7664 | |
| 9.4 | 12474 | |
| 9.3 | 11003 | |
| 9.2 | 6710 | |
| 9.1 | 15659 | |
| 9 | 11728 | |
| 8.9 | 16580 | |
| 8.6 | 5673 | 5.6% |
Crawled_date
Categorical
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 794.6 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2024-12-02 |
|---|---|
| 2nd row | 2024-12-02 |
| 3rd row | 2024-12-02 |
| 4th row | 2024-12-02 |
| 5th row | 2024-12-02 |
Common Values
| Value | Count | Frequency (%) |
| 2024-12-02 | 99191 | 97.5% |
| 2024-12-16 | 2502 | 2.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 404270 | 39.8% |
| - | 203386 | 20.0% |
| 0 | 200884 | 19.8% |
| 1 | 104195 | 10.2% |
| 4 | 101693 | 10.0% |
| 6 | 2502 | 0.2% |
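The character counts follow directly from the two date strings and their row counts. As a minimal sketch (assuming every value is stored as a 10-character `YYYY-MM-DD` string, as the Length table indicates), the table can be reproduced like this:

```python
from collections import Counter

# Row counts per distinct Crawled_date value, taken from the Common Values table.
date_counts = {"2024-12-02": 99191, "2024-12-16": 2502}

# Weight each character of a date string by the number of rows holding that date.
char_counts = Counter()
for value, n_rows in date_counts.items():
    for ch, per_value in Counter(value).items():
        char_counts[ch] += per_value * n_rows

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```

The total of 1016930 characters is 10 characters for each of the 101693 observations.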
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1016930 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 2 | 404270 | 39.8% |
| - | 203386 | 20.0% |
| 0 | 200884 | 19.8% |
| 1 | 104195 | 10.2% |
| 4 | 101693 | 10.0% |
| 6 | 2502 | 0.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1016930 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 2 | 404270 | 39.8% |
| - | 203386 | 20.0% |
| 0 | 200884 | 19.8% |
| 1 | 104195 | 10.2% |
| 4 | 101693 | 10.0% |
| 6 | 2502 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1016930 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 2 | 404270 | 39.8% |
| - | 203386 | 20.0% |
| 0 | 200884 | 19.8% |
| 1 | 104195 | 10.2% |
| 4 | 101693 | 10.0% |
| 6 | 2502 | 0.2% |
title_length
Real number (ℝ)
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.2954382 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 4 |
| Q3 | 8 |
| 95-th percentile | 16 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 5.0035407 |
|---|---|
| Coefficient of variation (CV) | 0.94487754 |
| Kurtosis | 1.8423706 |
| Mean | 5.2954382 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.4666878 |
| Sum | 538509 |
| Variance | 25.035419 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 27471 | 27.0% |
| 2 | 17294 | 17.0% |
| 4 | 7031 | 6.9% |
| 5 | 7021 | 6.9% |
| 6 | 6528 | 6.4% |
| 7 | 5387 | 5.3% |
| 3 | 5345 | 5.3% |
| 8 | 4403 | 4.3% |
| 9 | 3564 | 3.5% |
| 10 | 2888 | 2.8% |
| Other values (20) | 14761 | 14.5% |
| Value | Count | Frequency (%) |
| 1 | 27471 | 27.0% |
| 2 | 17294 | 17.0% |
| 3 | 5345 | 5.3% |
| 4 | 7031 | 6.9% |
| 5 | 7021 | 6.9% |
| 6 | 6528 | 6.4% |
| 7 | 5387 | 5.3% |
| 8 | 4403 | 4.3% |
| 9 | 3564 | 3.5% |
| 10 | 2888 | 2.8% |
| Value | Count | Frequency (%) |
| 31 | 1 | < 0.1% |
| 29 | 2 | < 0.1% |
| 28 | 7 | < 0.1% |
| 27 | 26 | < 0.1% |
| 26 | 49 | < 0.1% |
| 25 | 132 | 0.1% |
| 24 | 250 | 0.2% |
| 23 | 312 | 0.3% |
| 22 | 405 | 0.4% |
| 21 | 482 | 0.5% |
text_length
Real number (ℝ)
High correlation 
| Distinct | 424 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.019293 |
| Minimum | 1 |
|---|---|
| Maximum | 666 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 11 |
| median | 24 |
| Q3 | 48 |
| 95-th percentile | 113 |
| Maximum | 666 |
| Range | 665 |
| Interquartile range (IQR) | 37 |
Descriptive statistics
| Standard deviation | 41.234118 |
|---|---|
| Coefficient of variation (CV) | 1.1138548 |
| Kurtosis | 16.883715 |
| Mean | 37.019293 |
| Median Absolute Deviation (MAD) | 15 |
| Skewness | 3.1942633 |
| Sum | 3764603 |
| Variance | 1700.2525 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 2864 | 2.8% |
| 5 | 2830 | 2.8% |
| 7 | 2811 | 2.8% |
| 9 | 2786 | 2.7% |
| 8 | 2764 | 2.7% |
| 10 | 2639 | 2.6% |
| 4 | 2571 | 2.5% |
| 12 | 2526 | 2.5% |
| 11 | 2489 | 2.4% |
| 14 | 2310 | 2.3% |
| Other values (414) | 75103 | 73.9% |
| Value | Count | Frequency (%) |
| 1 | 756 | 0.7% |
| 2 | 1375 | 1.4% |
| 3 | 2050 | 2.0% |
| 4 | 2571 | 2.5% |
| 5 | 2830 | 2.8% |
| 6 | 2864 | 2.8% |
| 7 | 2811 | 2.8% |
| 8 | 2764 | 2.7% |
| 9 | 2786 | 2.7% |
| 10 | 2639 | 2.6% |
| Value | Count | Frequency (%) |
| 666 | 1 | < 0.1% |
| 571 | 1 | < 0.1% |
| 568 | 1 | < 0.1% |
| 527 | 1 | < 0.1% |
| 510 | 1 | < 0.1% |
| 503 | 1 | < 0.1% |
| 493 | 1 | < 0.1% |
| 491 | 1 | < 0.1% |
| 479 | 1 | < 0.1% |
| 471 | 1 | < 0.1% |
time_lapsed
Real number (ℝ)
| Distinct | 1098 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 540.84004 |
| Minimum | 0 |
|---|---|
| Maximum | 1097 |
| Zeros | 74 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 58 |
| Q1 | 273 |
| median | 532 |
| Q3 | 819 |
| 95-th percentile | 1020 |
| Maximum | 1097 |
| Range | 1097 |
| Interquartile range (IQR) | 546 |
Descriptive statistics
| Standard deviation | 310.48773 |
|---|---|
| Coefficient of variation (CV) | 0.57408422 |
| Kurtosis | -1.1743369 |
| Mean | 540.84004 |
| Median Absolute Deviation (MAD) | 274 |
| Skewness | 0.012506183 |
| Sum | 54999646 |
| Variance | 96402.632 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 518 | 204 | 0.2% |
| 511 | 203 | 0.2% |
| 1008 | 187 | 0.2% |
| 280 | 185 | 0.2% |
| 882 | 183 | 0.2% |
| 524 | 178 | 0.2% |
| 910 | 178 | 0.2% |
| 1001 | 175 | 0.2% |
| 616 | 171 | 0.2% |
| 1015 | 169 | 0.2% |
| Other values (1088) | 99860 | 98.2% |
| Value | Count | Frequency (%) |
| 0 | 74 | 0.1% |
| 1 | 102 | 0.1% |
| 2 | 103 | 0.1% |
| 3 | 103 | 0.1% |
| 4 | 90 | 0.1% |
| 5 | 68 | 0.1% |
| 6 | 94 | 0.1% |
| 7 | 142 | 0.1% |
| 8 | 81 | 0.1% |
| 9 | 67 | 0.1% |
| Value | Count | Frequency (%) |
| 1097 | 3 | < 0.1% |
| 1096 | 43 | < 0.1% |
| 1095 | 54 | 0.1% |
| 1094 | 59 | 0.1% |
| 1093 | 78 | 0.1% |
| 1092 | 152 | 0.1% |
| 1091 | 89 | 0.1% |
| 1090 | 79 | 0.1% |
| 1089 | 80 | 0.1% |
| 1088 | 70 | 0.1% |
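The Frequency (%) column appears to be the count divided by the number of observations (101693), rounded to one decimal place, with `< 0.1%` shown when the share rounds to zero. This rule is inferred from rows such as 1096 (43 rows, `< 0.1%`) and 1095 (54 rows, `0.1%`); it is not stated in the report. A small sketch under that assumption, with a hypothetical helper name:

```python
N_OBSERVATIONS = 101693  # "Number of observations" from the dataset statistics


def frequency_label(count, total=N_OBSERVATIONS):
    """Format a count as a one-decimal percentage label.

    Assumed rule: non-zero counts whose share rounds to 0.0 are shown as "< 0.1%".
    """
    rounded = round(100 * count / total, 1)
    if count > 0 and rounded == 0:
        return "< 0.1%"
    return f"{rounded:.1f}%"


for count in (43, 54, 2502, 5673):
    print(count, frequency_label(count))
```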
Deviation of star ratings
Real number (ℝ)
Zeros 
| Distinct | 104 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3356278 |
| Minimum | 0 |
|---|---|
| Maximum | 7.7 |
| Zeros | 2537 |
| Zeros (%) | 2.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.1 |
| Q1 | 0.4 |
| median | 1 |
| Q3 | 1.7 |
| 95-th percentile | 4 |
| Maximum | 7.7 |
| Range | 7.7 |
| Interquartile range (IQR) | 1.3 |
Descriptive statistics
| Standard deviation | 1.2913934 |
|---|---|
| Coefficient of variation (CV) | 0.96688116 |
| Kurtosis | 5.1357452 |
| Mean | 1.3356278 |
| Median Absolute Deviation (MAD) | 0.6 |
| Skewness | 2.0643546 |
| Sum | 135824 |
| Variance | 1.6676968 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.6 | 7857 | 7.7% |
| 0.4 | 7761 | 7.6% |
| 0.3 | 7703 | 7.6% |
| 1.4 | 5198 | 5.1% |
| 1.3 | 4863 | 4.8% |
| 0.7 | 4656 | 4.6% |
| 1 | 4094 | 4.0% |
| 0.1 | 4026 | 4.0% |
| 1.6 | 3352 | 3.3% |
| 2.3 | 3031 | 3.0% |
| Other values (94) | 49152 | 48.3% |
| Value | Count | Frequency (%) |
| 0 | 2537 | 2.5% |
| 0.1 | 4026 | 4.0% |
| 0.2 | 922 | 0.9% |
| 0.2 | 2860 | 2.8% |
| 0.3 | 7703 | 7.6% |
| 0.3 | 2451 | 2.4% |
| 0.4 | 7761 | 7.6% |
| 0.5 | 929 | 0.9% |
| 0.6 | 7857 | 7.7% |
| 0.6 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 7.7 | 17 | < 0.1% |
| 7.6 | 18 | < 0.1% |
| 7.4 | 110 | 0.1% |
| 7.3 | 68 | 0.1% |
| 7.2 | 8 | < 0.1% |
| 7.1 | 71 | 0.1% |
| 7 | 93 | 0.1% |
| 6.9 | 120 | 0.1% |
| 6.8 | 175 | 0.2% |
| 6.7 | 492 | 0.5% |
FOG Index
Real number (ℝ)
High correlation 
| Distinct | 2063 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.7008664 |
| Minimum | 0 |
|---|---|
| Maximum | 142.24 |
| Zeros | 13 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2.8 |
| Q1 | 6.61 |
| median | 8.57 |
| Q3 | 11.6 |
| 95-th percentile | 18.84 |
| Maximum | 142.24 |
| Range | 142.24 |
| Interquartile range (IQR) | 4.99 |
Descriptive statistics
| Standard deviation | 5.3759318 |
|---|---|
| Coefficient of variation (CV) | 0.55417027 |
| Kurtosis | 18.798141 |
| Mean | 9.7008664 |
| Median Absolute Deviation (MAD) | 2.49 |
| Skewness | 2.6341535 |
| Sum | 986510.21 |
| Variance | 28.900643 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8.04 | 2281 | 2.2% |
| 10 | 2164 | 2.1% |
| 9.07 | 1680 | 1.7% |
| 8.51 | 1616 | 1.6% |
| 11.6 | 1607 | 1.6% |
| 8.2 | 1549 | 1.5% |
| 8 | 1276 | 1.3% |
| 14.53 | 1103 | 1.1% |
| 13.2 | 959 | 0.9% |
| 12 | 898 | 0.9% |
| Other values (2053) | 86560 | 85.1% |
| Value | Count | Frequency (%) |
| 0 | 13 | < 0.1% |
| 0.4 | 344 | 0.3% |
| 0.8 | 478 | 0.5% |
| 1 | 18 | < 0.1% |
| 1.08 | 1 | < 0.1% |
| 1.2 | 716 | 0.7% |
| 1.32 | 8 | < 0.1% |
| 1.4 | 56 | 0.1% |
| 1.48 | 15 | < 0.1% |
| 1.52 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 142.24 | 1 | < 0.1% |
| 120.4 | 1 | < 0.1% |
| 106.39 | 1 | < 0.1% |
| 104.98 | 1 | < 0.1% |
| 86.1 | 1 | < 0.1% |
| 84.97 | 1 | < 0.1% |
| 80.4 | 5 | < 0.1% |
| 74.51 | 1 | < 0.1% |
| 72.32 | 1 | < 0.1% |
| 66 | 1 | < 0.1% |
Flesch Reading Ease
Real number (ℝ)
High correlation 
| Distinct | 2435 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 65.499026 |
| Minimum | -555.59 |
|---|---|
| Maximum | 206.84 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 2710 |
| Negative (%) | 2.7% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | -555.59 |
|---|---|
| 5-th percentile | 21.74 |
| Q1 | 56.93 |
| median | 71.44 |
| Q3 | 81.02 |
| 95-th percentile | 93.81 |
| Maximum | 206.84 |
| Range | 762.43 |
| Interquartile range (IQR) | 24.09 |
Descriptive statistics
| Standard deviation | 29.164817 |
|---|---|
| Coefficient of variation (CV) | 0.445271 |
| Kurtosis | 36.970882 |
| Mean | 65.499026 |
| Median Absolute Deviation (MAD) | 11.21 |
| Skewness | -4.1086938 |
| Sum | 6660792.5 |
| Variance | 850.58653 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 68.77 | 845 | 0.8% |
| 73.85 | 806 | 0.8% |
| 79.26 | 796 | 0.8% |
| 71.82 | 788 | 0.8% |
| 81.29 | 785 | 0.8% |
| 64.37 | 750 | 0.7% |
| 56.93 | 747 | 0.7% |
| 80.28 | 743 | 0.7% |
| 78.25 | 709 | 0.7% |
| 63.36 | 707 | 0.7% |
| Other values (2425) | 94017 | 92.5% |
| Value | Count | Frequency (%) |
| -555.59 | 2 | < 0.1% |
| -470.99 | 3 | < 0.1% |
| -386.39 | 5 | < 0.1% |
| -301.79 | 41 | < 0.1% |
| -265.85 | 1 | < 0.1% |
| -260.5 | 1 | < 0.1% |
| -219.22 | 1 | < 0.1% |
| -218.2 | 5 | < 0.1% |
| -217.19 | 104 | 0.1% |
| -177.93 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 206.84 | 13 | < 0.1% |
| 121.22 | 194 | 0.2% |
| 120.21 | 171 | 0.2% |
| 119.19 | 184 | 0.2% |
| 118.68 | 1 | < 0.1% |
| 118.18 | 131 | 0.1% |
| 117.77 | 1 | < 0.1% |
| 117.67 | 6 | < 0.1% |
| 117.26 | 1 | < 0.1% |
| 117.16 | 116 | 0.1% |
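The report does not say how the FOG Index and Flesch Reading Ease columns were computed. Assuming they follow the standard Gunning fog and Flesch formulas, the sketch below shows why the two move in opposite directions and why very short reviews produce extreme scores. The function names and the word, sentence and syllable counts passed in are illustrative only.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def gunning_fog(words, sentences, complex_words):
    """Standard Gunning fog index: higher scores mean harder text."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


# A one-word, one-syllable review with no complex words.
one_word_flesch = round(flesch_reading_ease(words=1, sentences=1, syllables=1), 2)
one_word_fog = round(gunning_fog(words=1, sentences=1, complex_words=0), 2)

# A single five-syllable word drives the Flesch score far below zero.
long_word_flesch = flesch_reading_ease(words=1, sentences=1, syllables=5)

print(one_word_flesch, one_word_fog, long_word_flesch)
```

Under these formulas a one-word, one-syllable review scores 121.22 on Flesch Reading Ease and 0.4 on the fog index, and both values appear in the extreme-value tables for these columns. That is consistent with the assumption but does not prove it.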
depth
Real number (ℝ)
High correlation 
| Distinct | 94036 |
|---|---|
| Distinct (%) | 92.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.51462251 |
| Minimum | 9.9788716 × 10⁻¹⁸ |
|---|---|
| Maximum | 1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 9.9788716 × 10⁻¹⁸ |
|---|---|
| 5-th percentile | 0.039373492 |
| Q1 | 0.38563247 |
| median | 0.55302141 |
| Q3 | 0.67136632 |
| 95-th percentile | 0.80328745 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 0.28573385 |
Descriptive statistics
| Standard deviation | 0.21836477 |
|---|---|
| Coefficient of variation (CV) | 0.4243203 |
| Kurtosis | -0.0061038129 |
| Mean | 0.51462251 |
| Median Absolute Deviation (MAD) | 0.13414015 |
| Skewness | -0.56272271 |
| Sum | 52333.507 |
| Variance | 0.047683175 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 1702 | 1.7% |
| 0.002382441358 | 243 | 0.2% |
| 9.97887157 × 10⁻¹⁸ | 128 | 0.1% |
| 1.939563365 × 10⁻¹⁷ | 104 | 0.1% |
| 1.32685554 × 10⁻¹⁷ | 97 | 0.1% |
| 1.135575798 × 10⁻¹⁷ | 83 | 0.1% |
| 1.049586256 × 10⁻¹⁷ | 77 | 0.1% |
| 1.000506049 × 10⁻¹⁷ | 73 | 0.1% |
| 0.2829621402 | 72 | 0.1% |
| 0.2522073769 | 66 | 0.1% |
| Other values (94026) | 99048 | 97.4% |
| Value | Count | Frequency (%) |
| 9.97887157 × 10⁻¹⁸ | 128 | 0.1% |
| 1.000506049 × 10⁻¹⁷ | 73 | 0.1% |
| 1.049586256 × 10⁻¹⁷ | 77 | 0.1% |
| 1.077492818 × 10⁻¹⁷ | 1 | < 0.1% |
| 1.0910371 × 10⁻¹⁷ | 41 | < 0.1% |
| 1.11725568 × 10⁻¹⁷ | 1 | < 0.1% |
| 1.120619813 × 10⁻¹⁷ | 1 | < 0.1% |
| 1.128725693 × 10⁻¹⁷ | 1 | < 0.1% |
| 1.135575798 × 10⁻¹⁷ | 83 | 0.1% |
| 1.135964433 × 10⁻¹⁷ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 1702 | 1.7% |
| 0.9617011723 | 1 | < 0.1% |
| 0.9590542372 | 1 | < 0.1% |
| 0.9544693421 | 1 | < 0.1% |
| 0.9543592049 | 1 | < 0.1% |
| 0.9502031837 | 1 | < 0.1% |
| 0.9448245866 | 1 | < 0.1% |
| 0.9408650086 | 1 | < 0.1% |
| 0.9407216602 | 1 | < 0.1% |
| 0.9393919822 | 1 | < 0.1% |
breadth
Real number (ℝ)
High correlation 
| Distinct | 93119 |
|---|---|
| Distinct (%) | 91.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.0915893 |
| Minimum | 0.11451788 |
|---|---|
| Maximum | 5.402044 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 794.6 KiB |
Quantile statistics
| Minimum | 0.11451788 |
|---|---|
| 5-th percentile | 0.92230132 |
| Q1 | 1.4424259 |
| median | 1.9107594 |
| Q3 | 2.5383131 |
| 95-th percentile | 4.0986157 |
| Maximum | 5.402044 |
| Range | 5.2875261 |
| Interquartile range (IQR) | 1.0958872 |
Descriptive statistics
| Standard deviation | 0.94935818 |
|---|---|
| Coefficient of variation (CV) | 0.45389321 |
| Kurtosis | 1.1123125 |
| Mean | 2.0915893 |
| Median Absolute Deviation (MAD) | 0.52768523 |
| Skewness | 0.96340279 |
| Sum | 212699.99 |
| Variance | 0.90128096 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.1145178764 | 1702 | 1.7% |
| 4.514983221 | 458 | 0.5% |
| 5.402044023 | 377 | 0.4% |
| 3.784350471 | 243 | 0.2% |
| 3.182034107 | 183 | 0.2% |
| 4.053054492 | 128 | 0.1% |
| 4.272895225 | 110 | 0.1% |
| 3.671437901 | 97 | 0.1% |
| 4.191737262 | 90 | 0.1% |
| 4.471387918 | 83 | 0.1% |
| Other values (93109) | 98222 | 96.6% |
| Value | Count | Frequency (%) |
| 0.1145178764 | 1702 | 1.7% |
| 0.367216634 | 1 | < 0.1% |
| 0.3729368476 | 1 | < 0.1% |
| 0.3735259523 | 1 | < 0.1% |
| 0.3754088904 | 1 | < 0.1% |
| 0.3874966346 | 1 | < 0.1% |
| 0.3880055101 | 1 | < 0.1% |
| 0.3884958044 | 1 | < 0.1% |
| 0.3912099536 | 1 | < 0.1% |
| 0.3929158163 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 5.402044023 | 377 | 0.4% |
| 5.402044023 | 4 | < 0.1% |
| 5.401914478 | 1 | < 0.1% |
| 5.40180921 | 1 | < 0.1% |
| 5.401558958 | 1 | < 0.1% |
| 5.401426672 | 1 | < 0.1% |
| 5.401373315 | 1 | < 0.1% |
| 5.401329444 | 1 | < 0.1% |
| 5.40128233 | 1 | < 0.1% |
| 5.401153131 | 1 | < 0.1% |
Interactions
Correlations
| | Average_Rating | Crawled_date | Deviation of star ratings | FOG Index | Flesch Reading Ease | Helpfulness | Hotel_Name | Num_of_Ratings | Rating | Unnamed: 0 | breadth | cleanliness_score | comfort_score | depth | employee_friendliness_score | facility_score | hotel_grade | is_photo | location_score | text_length | time_lapsed | title_length | value_for_money_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Average_Rating | 1.000 | 0.219 | -0.013 | 0.007 | -0.047 | -0.013 | 1.000 | -0.304 | 0.276 | -0.081 | 0.040 | 0.909 | 0.876 | -0.021 | 0.829 | 0.939 | 0.515 | 0.066 | 0.151 | -0.032 | 0.035 | -0.002 | 0.694 |
| Crawled_date | 0.219 | 1.000 | 0.061 | 0.007 | 0.011 | 0.000 | 1.000 | 0.189 | 0.017 | 0.479 | 0.044 | 0.298 | 0.322 | 0.038 | 0.354 | 0.323 | 0.192 | 0.001 | 0.320 | 0.013 | 0.029 | 0.009 | 0.693 |
| Deviation of star ratings | -0.013 | 0.061 | 1.000 | -0.010 | 0.019 | -0.043 | 0.183 | -0.065 | -0.053 | 0.067 | -0.014 | 0.013 | 0.013 | -0.017 | 0.009 | 0.020 | 0.119 | 0.037 | -0.095 | 0.047 | -0.015 | -0.072 | 0.006 |
| FOG Index | 0.007 | 0.007 | -0.010 | 1.000 | -0.751 | -0.001 | 0.016 | 0.020 | 0.010 | 0.002 | 0.080 | -0.000 | 0.004 | -0.070 | 0.006 | 0.006 | 0.010 | 0.024 | 0.025 | -0.089 | 0.005 | 0.027 | -0.020 |
| Flesch Reading Ease | -0.047 | 0.011 | 0.019 | -0.751 | 1.000 | 0.023 | 0.030 | 0.009 | -0.106 | 0.000 | -0.162 | -0.033 | -0.033 | 0.127 | -0.049 | -0.042 | 0.014 | 0.026 | -0.011 | 0.286 | -0.002 | 0.012 | -0.008 |
| Helpfulness | -0.013 | 0.000 | -0.043 | -0.001 | 0.023 | 1.000 | 0.029 | -0.004 | -0.078 | -0.035 | -0.062 | -0.008 | -0.007 | 0.041 | -0.030 | -0.006 | 0.011 | 0.025 | -0.020 | 0.120 | 0.029 | 0.030 | -0.003 |
| Hotel_Name | 1.000 | 1.000 | 0.183 | 0.016 | 0.030 | 0.029 | 1.000 | 1.000 | 0.153 | 0.936 | 0.121 | 1.000 | 1.000 | 0.096 | 1.000 | 1.000 | 1.000 | 0.129 | 1.000 | 0.022 | 0.106 | 0.027 | 1.000 |
| Num_of_Ratings | -0.304 | 0.189 | -0.065 | 0.020 | 0.009 | -0.004 | 1.000 | 1.000 | -0.074 | -0.093 | -0.072 | -0.349 | -0.221 | 0.064 | -0.319 | -0.307 | 0.604 | 0.106 | 0.362 | -0.001 | 0.040 | 0.035 | -0.382 |
| Rating | 0.276 | 0.017 | -0.053 | 0.010 | -0.106 | -0.078 | 0.153 | -0.074 | 1.000 | -0.004 | 0.051 | 0.258 | 0.258 | -0.009 | 0.243 | 0.265 | 0.125 | 0.066 | 0.064 | -0.187 | -0.001 | -0.020 | 0.188 |
| Unnamed: 0 | -0.081 | 0.479 | 0.067 | 0.002 | 0.000 | -0.035 | 0.936 | -0.093 | -0.004 | 1.000 | -0.011 | -0.021 | -0.035 | 0.024 | 0.116 | -0.098 | 0.554 | 0.079 | 0.071 | -0.008 | 0.022 | 0.008 | 0.086 |
| breadth | 0.040 | 0.044 | -0.014 | 0.080 | -0.162 | -0.062 | 0.121 | -0.072 | 0.051 | -0.011 | 1.000 | 0.021 | 0.001 | -0.787 | 0.039 | 0.037 | 0.034 | 0.064 | -0.051 | -0.535 | -0.022 | -0.153 | 0.020 |
| cleanliness_score | 0.909 | 0.298 | 0.013 | -0.000 | -0.033 | -0.008 | 1.000 | -0.349 | 0.258 | -0.021 | 0.021 | 1.000 | 0.958 | -0.003 | 0.811 | 0.945 | 0.531 | 0.069 | 0.093 | -0.020 | 0.028 | 0.002 | 0.729 |
| comfort_score | 0.876 | 0.322 | 0.013 | 0.004 | -0.033 | -0.007 | 1.000 | -0.221 | 0.258 | -0.035 | 0.001 | 0.958 | 1.000 | 0.018 | 0.785 | 0.938 | 0.473 | 0.058 | 0.117 | -0.019 | 0.058 | 0.008 | 0.634 |
| depth | -0.021 | 0.038 | -0.017 | -0.070 | 0.127 | 0.041 | 0.096 | 0.064 | -0.009 | 0.024 | -0.787 | -0.003 | 0.018 | 1.000 | -0.012 | -0.019 | 0.027 | 0.046 | 0.050 | 0.426 | 0.035 | 0.133 | -0.005 |
| employee_friendliness_score | 0.829 | 0.354 | 0.009 | 0.006 | -0.049 | -0.030 | 1.000 | -0.319 | 0.243 | 0.116 | 0.039 | 0.811 | 0.785 | -0.012 | 1.000 | 0.790 | 0.426 | 0.086 | 0.105 | -0.032 | 0.034 | 0.000 | 0.664 |
| facility_score | 0.939 | 0.323 | 0.020 | 0.006 | -0.042 | -0.006 | 1.000 | -0.307 | 0.265 | -0.098 | 0.037 | 0.945 | 0.938 | -0.019 | 0.790 | 1.000 | 0.653 | 0.067 | 0.054 | -0.026 | 0.030 | -0.007 | 0.678 |
| hotel_grade | 0.515 | 0.192 | 0.119 | 0.010 | 0.014 | 0.011 | 1.000 | 0.604 | 0.125 | 0.554 | 0.034 | 0.531 | 0.473 | 0.027 | 0.426 | 0.653 | 1.000 | 0.061 | 0.433 | 0.014 | 0.055 | 0.014 | 0.321 |
| is_photo | 0.066 | 0.001 | 0.037 | 0.024 | 0.026 | 0.025 | 0.129 | 0.106 | 0.066 | 0.079 | 0.064 | 0.069 | 0.058 | 0.046 | 0.086 | 0.067 | 0.061 | 1.000 | 0.053 | 0.099 | 0.011 | 0.051 | 0.056 |
| location_score | 0.151 | 0.320 | -0.095 | 0.025 | -0.011 | -0.020 | 1.000 | 0.362 | 0.064 | 0.071 | -0.051 | 0.093 | 0.117 | 0.050 | 0.105 | 0.054 | 0.433 | 0.053 | 1.000 | -0.000 | -0.003 | 0.055 | 0.028 |
| text_length | -0.032 | 0.013 | 0.047 | -0.089 | 0.286 | 0.120 | 0.022 | -0.001 | -0.187 | -0.008 | -0.535 | -0.020 | -0.019 | 0.426 | -0.032 | -0.026 | 0.014 | 0.099 | -0.000 | 1.000 | -0.028 | 0.223 | -0.013 |
| time_lapsed | 0.035 | 0.029 | -0.015 | 0.005 | -0.002 | 0.029 | 0.106 | 0.040 | -0.001 | 0.022 | -0.022 | 0.028 | 0.058 | 0.035 | 0.034 | 0.030 | 0.055 | 0.011 | -0.003 | -0.028 | 1.000 | -0.003 | 0.062 |
| title_length | -0.002 | 0.009 | -0.072 | 0.027 | 0.012 | 0.030 | 0.027 | 0.035 | -0.020 | 0.008 | -0.153 | 0.002 | 0.008 | 0.133 | 0.000 | -0.007 | 0.014 | 0.051 | 0.055 | 0.223 | -0.003 | 1.000 | -0.007 |
| value_for_money_score | 0.694 | 0.693 | 0.006 | -0.020 | -0.008 | -0.003 | 1.000 | -0.382 | 0.188 | 0.086 | 0.020 | 0.729 | 0.634 | -0.005 | 0.664 | 0.678 | 0.321 | 0.056 | 0.028 | -0.013 | 0.062 | -0.007 | 1.000 |
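The "High correlation" alerts in the overview are consistent with flagging any pair whose absolute coefficient is at least 0.5. That cutoff is an assumption inferred from the matrix; the report does not state it. A minimal sketch over a handful of coefficients copied from the table, with a hypothetical helper name:

```python
# A few coefficients copied from the correlation table (not the full matrix).
pairs = {
    ("FOG Index", "Flesch Reading Ease"): -0.751,
    ("breadth", "depth"): -0.787,
    ("breadth", "text_length"): -0.535,
    ("depth", "text_length"): 0.426,
    ("location_score", "hotel_grade"): 0.433,
    ("Crawled_date", "value_for_money_score"): 0.693,
    ("Crawled_date", "Unnamed: 0"): 0.479,
}

THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def high_correlation(coefficients, threshold=THRESHOLD):
    """Return the pairs whose absolute coefficient reaches the threshold."""
    return sorted(pair for pair, r in coefficients.items() if abs(r) >= threshold)


flagged = high_correlation(pairs)
for left, right in flagged:
    print(f"{left} <-> {right}: {pairs[(left, right)]:+.3f}")
```

With this cutoff, depth and text_length (0.426) and location_score and hotel_grade (0.433) are not flagged, which matches the alert list.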
Missing values
Sample
First rows
| | Unnamed: 0 | Hotel_Name | Review_Text | Posted_Date | Rating | Average_Rating | Num_of_Ratings | Helpfulness | is_photo | review_title | hotel_grade | employee_friendliness_score | facility_score | cleanliness_score | comfort_score | value_for_money_score | location_score | Crawled_date | title_length | text_length | time_lapsed | Deviation of star ratings | FOG Index | Flesch Reading Ease | depth | breadth |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | studios2let | Perfect location with good connections and shops and pubs | 2024-05-01 | 10.0 | 7.6 | 11670 | 0 | 0 | Exceptional | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 1 | 9 | 215 | 2.4 | 12.49 | 62.34 | 0.404319 | 2.930427 |
| 1 | 1 | studios2let | The room had everything you needed. Near to amenities, was good room for price just needs little updatingThe bed was so hard it felt like sleeping on a hard floor, you had to make sure you had something on your feet as flooring pinched you feet needs changing | 2024-12-02 | 8.0 | 7.6 | 11670 | 0 | 0 | Very good | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 2 | 48 | 0 | 0.4 | 10.43 | 80.96 | 0.550483 | 1.213568 |
| 2 | 2 | studios2let | Conveniently nearby St. Pancras, very small but clean and pleasant room (first floor with small balcony to street side). Interesting area.Luggage service can be improved by offering to lock luggage up instead of it just being put into the hall with all risks on the guests. | 2024-12-01 | 8.0 | 7.6 | 11670 | 0 | 0 | Convenient location | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 2 | 46 | 1 | 0.4 | 7.86 | 72.87 | 0.593700 | 1.601652 |
| 3 | 3 | studios2let | Reception staffed 24 hours a day.All good. | 2024-12-01 | 9.0 | 7.6 | 11670 | 0 | 0 | Peaceful position in an elegant street close to 3 major stations and the Bloomsbury area. | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 15 | 7 | 1 | 1.4 | 8.51 | 81.29 | 0.343037 | 2.708736 |
| 4 | 4 | studios2let | Very convenient to King’s Cross and the cityA little dated could do with a lick of paint | 2024-11-30 | 8.0 | 7.6 | 11670 | 0 | 0 | Great little gem in the city centre | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 7 | 17 | 2 | 0.4 | 9.15 | 88.06 | 0.705426 | 1.030207 |
| 5 | 5 | studios2let | Located in a quiet area but close to Kings Cross station so getting around was easy. Several little pubs nearby for dining and some good coffee shops too.There is no lift so dragging a heavy suitcase up and down stairs was challenging. We had booked a room with terrace but the outdoor space was really minuscule - not what we had expected from the photos. | 2024-11-30 | 7.0 | 7.6 | 11670 | 0 | 0 | Convenient, quiet location. | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 3 | 65 | 2 | 0.6 | 8.90 | 72.16 | 0.493165 | 1.671327 |
| 6 | 6 | studios2let | It's spacious, good value and so very quiet for London.You sometimes have to wriggle the loo flusher to stop it running and running | 2024-11-30 | 9.0 | 7.6 | 11670 | 0 | 0 | Superb | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 1 | 23 | 2 | 1.4 | 4.60 | 76.72 | 0.440611 | 1.477398 |
| 7 | 7 | studios2let | LocationLot of stairs (bad knee) | 2024-11-29 | 9.0 | 7.6 | 11670 | 0 | 0 | Ideal location for travelling round | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 5 | 5 | 3 | 1.4 | 10.00 | 66.40 | 0.412727 | 2.357962 |
| 8 | 8 | studios2let | Location was great, so near the stationWe were on the top floor, six flights of stairs and no lift.\nHeating was on 24:7 full temperature and no means of reducing it! | 2024-11-29 | 7.0 | 7.6 | 11670 | 0 | 0 | Perfect location, | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 2 | 31 | 3 | 0.6 | 11.36 | 81.12 | 0.508956 | 1.917493 |
| 9 | 9 | studios2let | The location which is excellent for public transport and local dining. \nFriendly staffed reception where we could leave our travel bags all day after checking out.The climb up 3 flights of stairs was exhausting but it was our choice.\nIt was a small room and the kitchen facilities were very sparse ( but we didn't need them) | 2024-11-28 | 8.0 | 7.6 | 11670 | 0 | 0 | Ideal accommodation for a short stay in London near St Pancreas station | 3 | 8.3 | 7.5 | 7.9 | 7.8 | 7.6 | 9.3 | 2024-12-02 | 12 | 57 | 4 | 0.4 | 9.17 | 74.19 | 0.778746 | 1.675657 |
Last rows
| | Unnamed: 0 | Hotel_Name | Review_Text | Posted_Date | Rating | Average_Rating | Num_of_Ratings | Helpfulness | is_photo | review_title | hotel_grade | employee_friendliness_score | facility_score | cleanliness_score | comfort_score | value_for_money_score | location_score | Crawled_date | title_length | text_length | time_lapsed | Deviation of star ratings | FOG Index | Flesch Reading Ease | depth | breadth |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 101683 | 103553 | montanahotel | Just the locationVery poor facilities . Rooms are pretty old and the toilets are not functioning at the best . Breakfast was also very basic . If you need an English breakfast you need to upgrade and pay extra and it’s absolutely not worth it . Hash brown being the most basic was ridiculous. They really need to improve a lot . The only thing was staff was cooperative and helpful . | 2023-09-18 | 3.0 | 7.8 | 6248 | 0 | 0 | 나쁨 | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 1 | 72 | 455 | 4.8 | 6.79 | 70.39 | 0.616648 | 1.496253 |
| 101684 | 103555 | montanahotel | It was good enoughDelayed check in, cracked basin in bathroom, water pressure poor. Everything else was fine. | 2023-07-11 | 6.0 | 7.8 | 6248 | 0 | 0 | Not a bad place to stay if it’s where you need to be. | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 13 | 17 | 524 | 1.8 | 3.40 | 79.77 | 0.527146 | 2.300939 |
| 101685 | 103556 | montanahotel | locationhot room, shower didn’t drain, broken sink | 2023-07-01 | 5.0 | 7.8 | 6248 | 0 | 0 | Great location and generally clean spot but the place is a bit a dated and the basement room was damp, hot and a bit mus | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 25 | 7 | 534 | 2.8 | 8.51 | 38.99 | 0.174663 | 2.642188 |
| 101686 | 103557 | montanahotel | Good to have tea/coffee and a fridge in the room.The building is beautiful but the interior decor leaves a lot to be desired. The hotel is Indian in style...red/gold...faded wallpaper and threadbare carpets...the room was OK but again needed updating. We were in the basement with no view...only rubbish out of the window. There was no hot breakfast so only cereals, fruit and pastries...but it was in a pleasant location...near the tube and shops/pubs etc...only a short walk to the Natural History museum. | 2023-04-25 | 6.0 | 7.8 | 6248 | 0 | 0 | Lovely building with quite a grand entrance...let down by the interior...fine for overnight stay. | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 14 | 83 | 601 | 1.8 | 6.37 | 63.86 | 0.681052 | 1.021631 |
| 101687 | 103558 | montanahotel | locationwater pressure was non existent.\ndespite several request to address the problem. | 2023-03-25 | 3.0 | 7.8 | 6248 | 0 | 0 | while the staff was nice. They did very little to remedy the lack of shower and hot water problem we had . | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 22 | 12 | 632 | 4.8 | 9.07 | 23.09 | 0.517236 | 2.098524 |
| 101688 | 103559 | montanahotel | Convenient and classy. The staff are excellent people, and Light of India is a fantastic restaurant. I would certainly stay again.N/A | 2022-12-28 | 10.0 | 7.8 | 6248 | 0 | 0 | Highly recommend this little gem situated in my favourite part of town. | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 12 | 21 | 719 | 2.2 | 8.51 | 64.37 | 0.609116 | 1.975976 |
| 101689 | 103560 | montanahotel | lovely atmosphere, extremely friendly and helpful staff. | 2022-07-01 | 10.0 | 7.8 | 6248 | 0 | 0 | Perfect location for our visit to the Royal Albert Hall and the Natural History Museum. would | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 16 | 7 | 899 | 2.2 | 14.23 | 30.53 | 0.187846 | 3.066338 |
| 101690 | 103561 | montanahotel | It was a single room, a little small but it was fine for 1 person, it had everything I needed | 2022-06-28 | 10.0 | 7.8 | 6248 | 0 | 1 | The staff were very friendly and helpful. The position was perfect for sightseeing | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 13 | 20 | 902 | 2.2 | 8.00 | 76.56 | 0.328068 | 2.343981 |
| 101691 | 103562 | montanahotel | Very clean and well maintained.The rooms are very nice and comfortable with staffs professionalism.The food are delicious,nice breakfast,lunch ,dinner and the cocktails are exceptional.Notting much just that there’s no parking. | 2022-02-16 | 10.0 | 7.8 | 6248 | 0 | 1 | Myself and my wife really enjoy our stay at this hotel,we love the service and all the staffs are amazing.Looking forwar | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 21 | 30 | 1034 | 2.2 | 8.33 | 63.86 | 0.535681 | 2.192362 |
| 101692 | 103563 | montanahotel | The staff were very friendly and helpful! Especially Kampas The hotel was very clean and the fact that they had a wonderful Indian restaurant as part of it was amazing. Best Vindaloo ever!!!!Shower a tad small but adequate xx | 2022-02-06 | 10.0 | 7.8 | 6248 | 0 | 1 | Loved every minute! we will be back!! Xxx | 3 | 9.0 | 7.7 | 8.2 | 8.2 | 8.0 | 9.4 | 2024-12-16 | 8 | 39 | 1044 | 2.2 | 5.97 | 69.99 | 0.586610 | 1.649065 |